.\" $Id$
.\"

.TH SQLOG 1 "SLURM Query Log"

.SH NAME
sqlog \- SLURM query log utility

.SH SYNOPSIS
.B sqlog
[\fIOPTIONS\fR]...

.SH DESCRIPTION
The \fBsqlog\fR utility provides a single interface to query information
about jobs from the SLURM job log database and/or the current queue
of running jobs. 

By default both the current queue of running jobs and the database
of completed jobs are queried, and a limit of 25 results is displayed.
If more results are available in the database or queue, sqlog will
note the fact with an informational message to stderr:
.nf 

    sqlog: [More results available....]

.fi 
This message is suppressed if the \fB--no-header\fR option is provided.

.SH CONFIGURATION

\fBsqlog\fR reads configuration from the \fBsqlog.conf\fR config file
(typically in /etc/slurm). This config file provides information about
the SLURM job log database location and username.  In addition,
\fBsqlog\fR  reads user defaults and additional output format types
from a ~/.sqlog file if it exists. See the USER CONFIG section
below for more information.

Various config parameters are set in the following order:
Internal defaults, system configuration, user configuration, 
and command-line.

.SH OPTIONS
.TP
.BI "-h, --help"
Display a summary of the command-line options.
.TP
.BI "-v, --verbose"
Increase debugging verbosity of the program.
.TP
.BI "--dry-run"
Don't actually do anything.
.TP
.BI "-j, --jobids " LIST
Provide a comma-separated list of jobids to include (or exclude if
preceded by the \fI-x\fR, \fI--exclude\fR option). This option may
be specified multiple times.
.TP
.BI "-J, --job-names " LIST
Provide a comma-separated list of job names to include (or exclude if
preceded by the \fI-x\fR, \fI--exclude\fR option). This option may
be specified multiple times.
.TP
.BI "-n, --nodes " LIST
Provide a comma-separated list of nodes or node lists to include 
(or exclude if preceded by the \fI-x\fR, \fI--exclude\fR option).
This option may be specified multiple times. Node lists can be
in hostlist format, e.g. host[34-36,67].
.TP
.BI "-p, --partitions " LIST
Provide a comma-separated list of partitions to include (or exclude if
preceded by the \fI-x\fR, \fI--exclude\fR option). This option may
be specified multiple times.
.TP
.BI "-s, --states " LIST
Provide a comma-separated list of job states to include (or exclude if
preceded by the \fI-x\fR, \fI--exclude\fR option). This option may
be specified multiple times. Use --states=list to generate list of valid
job state keys.
.TP
.BI "-u, --users " LIST
Provide a comma-separated list of users to include (or exclude if
preceded by the \fI-x\fR, \fI--exclude\fR option). This option may
be specified multiple times.
.TP
.BI "--regex"
Enable regular expressions instead of exact matching for the following list
of jobids, job names, partitions, states, and user names. For example
.nf

     sqlog --regex --job-names='^foo-.*'

.fi
.TP
.BI "-x, --exclude"
Exclude the following list of jobids, users, partitions, nodes, job names,
or job states. For example: \fB--exclude --jobids\fR=\fILIST\fR or 
\fB-xn\fR \fINODES\fR.
.TP
.BI "-N, --nnodes " N
List jobs that ran on exactly \fIN\fR nodes. \fIN\fR may also have the 
form +\fIN\fR, -\fIN\fR, or \fIN\fR..\fIM\fR, to specify a minimum, 
maximum, or range of node counts. For examples see RANGE OPERATORS
section below.
.TP
.BI "--minnodes " N
Explicity specify a minimum node count. This is equivalent to using
the option \fB--nnodes\fR=+\fIN\fR.
.TP
.BI "--maxnodes " N
Explicity specify a maximum node count. This is equivalent to using
the option \fB--nnodes\fR=-\fIN\fR.
.TP
.BI "-C, --ncores " N
List jobs that ran on exactly \fIN\fR cores. \fIN\fR may also have the 
form +\fIN\fR, -\fIN\fR, or \fIN\fR..\fIM\fR, to specify a minimum, 
maximum, or range of node counts. For examples see RANGE OPERATORS
section below.
.TP
.BI "--mincores " N
Explicity specify a minimum core count. This is equivalent to using
the option \fB--ncores\fR=+\fIN\fR.
.TP
.BI "--maxcores " N
Explicity specify a maximum core count. This is equivalent to using
the option \fB--ncores\fR=-\fIN\fR.
.TP
.BI "-T, --runtime " DURATION
List jobs that ran or have run for \fIDURATION\fR. \fIDURATION\fR may
have the form DD-HH:MM:SS or DDdHHhMMmSSs, where DD is days, HH
hours, MM minutes, and SS seconds. In the second form, values that
are zero may be left out, e.g. 4h. In the first form, days, hours,
and minutes are optional, e.g. :04 is 4 seconds. \fIDURATION\fR may
be specified as +\fIT\fR, -\fIT\fR or \fIT1\fR..\fIT2\fR to specify
a min, max, or runtime range. For more information, see the RANGE
OPERATORS section below.
.TP
.BI "--mintime " DURATION
List all jobs that ran for at least \fIDURATION\fR.
This is equivalent to specifying \fB--runtime\fR=+\fIDURATION\fR.
.TP
.BI "--maxtime " DURATION
List all jobs that ran for at most \fIDURATION\fR.
This is equivalent to specifying \fB--runtime\fR=-\fIDURATION\fR.
.TP
.BI "-t, --time, --at " TIME
List jobs which were running at a particular date and time.
\fITIME\fR arguments are parsed by the perl \fBDate::Manip\fR(3pm)
package, so many date/time formats are allowed, i.e. 2pm or
"4/14 15:30:00". A window of time may be specified by separating the
start and end of the window with "..", e.g. \fB--time\fR=2pm..3pm.
For more information see the RANGE OPERATORS section below.
.TP
.BI "-S, --start " TIME
List all jobs that started at date and time \fITIME\fR. \fITIME\fR may
have the form +\fIT\fR, -\fIT\fR, or \fIT1\fR..\fIT2\fR to specify a
minimum, maximum, or start time range. See RANGE OPERATORS below
for more information.
.TP
.BI "--start-after " TIME
List all jobs that started after time \fITIME\fR. This is equivalent
to using \fB--start\fR=+\fITIME\fR.
.TP
.BI "--start-before " TIME
List all jobs that started before time \fITIME\fR. This is equivalent
to using \fB--start\fR=-\fITIME\fR.
.TP
.BI "-E, --end " TIME
List all jobs that ended at date and time \fITIME\fR. \fITIME\fR may
have the form +\fIT\fR, -\fIT\fR, or \fIT1\fR..\fIT2\fR to specify a
minimum, maximum, or end time range (see RANGE OPERATORS below for
more information). For running jobs, SLURM uses
an estimated end time, so end times in the future are valid and will
be used. (and using \fB--end\fR=+now would list all currently 
running jobs, since they all end in the future.)
.TP
.BI "--end-after " TIME
List all jobs that ended (or will end) after time \fITIME\fR. This is 
equivalent to using \fB--end\fR=+\fITIME\fR.
.TP
.BI "--end-before " TIME
List all jobs that ended (or will end) before time \fITIME\fR. This is 
equivalent to using \fB--end\fR=-\fITIME\fR.
.TP
.BI "-X, --no-running" 
Do not query running jobs, i.e. ignore the current queue and only
query the SLURM job log database.
.TP
.BI "--no-db"
Do not query the SLURM job log database, i.e. only query the current
queue.
.TP
.BI "-H, --no-header"
Do not display header rows in output.
.TP
.BI "-o, --format " LIST
Specify a list of format keys to display or a format type, or both
using the form \fITYPE\fR:\fIKEYS\fR... Use \fB--format\fR=list to
list valid keys and types. See OUTPUT FORMAT below for further 
information.
.TP
.BI "-P, --sort " LIST
Specify a list of keys on which to sort output. Prepend a '-' to sort
in descending as opposed to ascending order. List valid keys
using \fB--sort\fR=list. The default sort method is '-start'. 
.TP
.BI "-L, --limit " N
Limit the number of records to report (Default = 25).
.TP
.BI "-a, --all"
Do not limit the number of returned results. (Return all matching rows).
This is equivalent to \fB--limit\fR=0.

.SH RANGE OPERATORS
\fITIME\fR, \fIDURATION\fR, and numeric arguments may use the
RANGE OPERATORS '+', '-', and '..' to specify minimum, maximum,
or a range of values respectively.  TIME arguments may also use
the '@' symbol to escape a leading + or - in the TIME itself
(e.g. '-1hr' means '1 hr ago').  The \fB--time\fB, \fB--start\fR,
\fB--end\fR, \fB--runtime\fB, and \fB--nnodes\fR options to
\fBsqlog\fR all take RANGE OPERATORS.
.TP
Examples 
.TP 20
.BI "--nnodes " +8
Jobs that ran with 8 or more nodes.
.TP
.BI "--nnodes " 16..32
Jobs that ran with between 16 and 32 nodes, inclusive.
.TP
.BI "--runtime " -2h
Jobs that ran for 2 hours or less.
.TP
.BI "--runtime " 5m..1hr
Jobs that ran for between 5 minutes and 1 hour, inclusive.
.TP
.BI "--end " 2pm..3pm
Jobs that ended today between 2PM and 3PM, inclusive.
.TP
.BI "--time " 7/17..7/18
Jobs that ran anytime from 12AM, 7/17 to 12AM, 7/18.
.TP
.BI "--time " "+'1 hour ago'"
Jobs that ran in the past hour (1 hour ago or later).
.TP
.BI "--time " "+-1hr (or +@-1hr)"
Same as above.
.TP
.BI "--time " @-1hr
Jobs that were running exactly at one hour ago.
.TP
.BI "--time " @-2hr..-1hr
Jobs that were running between 2 hours ago and 1 hour ago.


.SH USER CONFIGURATION
When \fBsqlog\fR runs, it will first check for a ~/.sqlog file and 
parse it if it exists. At this time, the ~/.sqlog file may be used 
to set a new default limit (see \fB--limit\fR) and addtional output format 
types (see \fB--format\fR). These two configuration parameters take the form:
.TP 20
\fBlimit\fR = \fIN\fR
Set the new default output limit to \fIN\fR.
.TP
\fBformat{\fINAME\fB}\fR = \fILIST...\fR
Create an alias \fINAME\fR for the format list \fILIST\fR.
.PP
For example, the following sqlog file
.nf
    #  Sample ~/.sqlog file
    limit = 30
    format{mine} = long:start,end,jobid,user,state

.fi
would set the default output limit to 30 records and 
add a new format type \fImine\fR. The new format type would 
be used by specifying 
.nf

    \fB--format\fR \fImine\fR

.fi 
on the command line, which would be equivalent to 
.nf

    \fB--format\fR long:start,end,jobid,user,state

.fi
Any number of format types may be specified in this way, though
if there are duplicate names, the last one specified will override
all previous types. This also implies that a user can redefine
the default \fBsqlog\fR format types \fIshort\fR, \fIlong\fR,
and \fIfreeform\fR, though this is not recommended.

.SH OUTPUT FORMAT
\fBsqlog\fR provides precise control over the output format, which aids with
readability and simplifies parsing via scripts.  When parsing output, be sure
to specify each field and the expected order using the -o,--format option.
The built-in formats (short, long, and freeform) may add or reorder fields
over time.

By default, \fBsqlog\fR uses the output format 
.nf

   short:jobid,partition,name,user,state,start,runtime,ncores,nnodes,nodes

.fi

The \fIshort:\fR preceeding the format specification tells \fBsqlog\fR
to use the \fIshort\fR form of each of the format keys. The result
is what you see when running \fBsqlog\fR without using the -o,--format
option. All format keys currently available are detailed here. Some
keys have shorter aliases that are provided for convenience. These
are listed alongside the full key name below. Note that all these
keys can also be listed by using --\fIformat=list\fR.
.TP 20
.B "jobid | jid"
The SLURM jobid for this job.
.TP
.B "partition | part"
The SLURM partition in which the job ran or is running.
.TP 
.B "name"
The name of the job as recorded by SLURM.
.TP
.B "user"
The username of the user running the job.
.TP 
.B "state | st"
The current or final state of the job. See JOB STATE CODES
for a description of the two-letter codes that this field
displays by default.
.TP 
.B "start"
The start time of the job in the form MM/DD-HH:MM:SS.
.TP 
.B "runtime | time"
The total runtime of the job in the form
DD-HH:MM:SS. Leading zero values may be dropped,
for instance 4:30 is 4 minutes 30 seconds.
.TP 
.B "ncores | C"
The number of cores allocated to the job.
.TP 
.B "nnodes | N"
The number of nodes allocated to the job.
.TP 
.B "nodes"
The nodelist that was allocated to the job. Note that for
completing jobs (CG) this nodelist will be restricted to
the currently completing nodes for the job. To see the
full nodelist, restrict \fBsqlog\fR to the database only,
i.e. run with the -X, --no-running option.
.TP 
.B "runtime_s | time_s"
The total job runtime in seconds.
.TP 
.B "end"
The time at which the job completed in the form
MM/DD-HH:MM:SS.
.TP
.B "longstart"
Date and time the job started in the form
YYYY-MM-DDTHH:MM:SS. This is displayed by
default in the \fIlong\fR format type.
.TP 
.B "longend"
Date and time the job ended in the form
YYYY-MM-DDTHH:MM:SS. This is displayed by
default in the \fIlong\fR format type.
.TP 
.B "unixstart"
Job start time in seconds since epoch.
.TP 
.B "unixend"
Job end time in seconds since epoch.
.TP 0

A format type may be specified in addition to the format fields.  These change the output width and in some cases the output format of the fields above. The format type may also be specified alone to the \fI--format\fR option. For instance \fI--format=long\fR would choose the default fields configured for the \fIlong\fR format type.

.TP 20
.B "short"
This is the default output type. It uses the format fields:
jobid,part,name,user,state,start,runtime,ncores,nnodes,nodes
.TP
.B "long"
This format type uses longer widths for most fields, and
displays the the full job state code by default (e.g.
completing instead of CG). Its default format fields are:
jobid,part,name,user,state,longstart,longend,runtime,ncores,nnodes,nodes
.TP
.B "freeform"
This is a freeform output in which full width fields are displayed
separated by whitespace. This would be used for parsing sqlog
output for instance, to guarantee no field is trunctated.
It uses the same format fields as the \fBlong\fR format type.

.SH JOB STATE CODES
In normal output, job states are displayed with two letter abbreviations
in \fBsqlog\fR output. Job state codes are fully explained in the
\fBsqueue\fR(1) man page, but the abbreviations are restated here
for completeness.
.TP 20 
.B "CA   CANCELLED"
Job was cancelled.
.TP
.B "CD   COMPLETED"
Job completed normally.
.TP
.B "CG   COMPLETING"
Job is in the process of completing.
.TP
.B "F    FAILED"
Job termined abnormally.
.TP
.B "NF   NODE_FAIL"
Job terminated due to node failure.
.TP
.B "PD   PENDING"
Job is pending allocation.
.TP 
.B "R    RUNNING"
Job currently has an allocation.
.TP
.B "S    SUSPENDED"
Job is suspended.
.TP 
.B "TO   TIMEOUT"
Job terminated upon reaching its time limit.


.SH EXAMPLES
Display the job or jobs that were running on host55 at July 19, 4:00PM:
.nf

    sqlog --time="July 19, 4pm" --nodes=host55

.fi
Display at most 25 jobs that were running at midnight yesterday:
.nf

    sqlog --time=yesterday,midnight

.fi
Display all jobs that failed between 8:00AM and 9:00AM this morning,
sorted by descending endtime:
.nf

    sqlog --all --end=8am..9am --states=F --sort=-end

.fi 
Display all jobs that started today:
.nf

    sqlog --start=+midnight --all

.fi
Display all jobs that have run between 3 and 4 hours on the nodes
host30 through host65, and that didn't complete normally
.nf

   sqlog -L 0 -T=3h..4h -n 'host[30-65]' -xs completed

.fi  
Display all jobs that were running yesterday with 1000 nodes or 
greater and completed normally:
.nf

    sqlog -t yesterday,12am..12am -s CD -N +1000

.fi
List current queue, sorted by number of nodes (ascending):
.nf

    sqlog --all --no-db --sort=nnodes

.fi
List the top 10 longest running jobs, and then the 5 oldest jobs:
.nf

    sqlog --sort=runtime --limit=10
    sqlog --sort=-start --limit=5
	
.fi
.SH AUTHOR
Written by Adam Moody and Mark Grondona.
